Continuing our investigations of quenched QCD with improved fermions we have started simulations for lattice
size $32^3 \times 64$ at $\beta = 6.2$. We present first results for light hadron masses at $\kappa = 0.13520$, $0.13540$, and $0.13555$.
Moreover we compare our initial experiences on the T3E with those for APE/Quadrics systems.



1. INTRODUCTION 

High computer costs turned out to be a major 
problem when performing quenched QCD simula- 
tions at smaller lattice spacing a. Improving the 
action in order to reduce cut-off effects therefore 
became an important goal. 

While the standard gluonic action has discretization
errors of $O(a^2)$, those for Wilson fermions
are of $O(a)$. Sheikholeslami and Wohlert
proposed a modified fermion action:

[...] (1)

where $S_F^{(0)}$ is the standard Wilson action and

[...] $\left( U(x)_{\mu\nu} - U(x)_{\mu\nu}^{\dagger} \right)$. (2)

If the coefficient $c_{SW}$ of the so-called clover term
is chosen appropriately, this action removes all
$O(a)$ errors from on-shell quantities like hadron
masses. A non-perturbative calculation of $c_{SW}$
as a function of the gauge coupling was done by the
Alpha collaboration.

Until now the QCDSF collaboration has presented
results for the light hadron spectrum using the
improved action for two values of the coupling,
$\beta = 6.0$ and $6.2$. These calculations
have been carried out on APE Quadrics
computers on lattice sizes up to $24^3 \times 48$. In order
to allow a more reliable estimate of the chiral
limit for $\beta = 6.2$ we started calculations with a
hopping parameter $\kappa$ closer to the critical value,
$\kappa_c$, on a lattice of size $32^3 \times 64$.

At present we have evaluated $O(75)$ configurations.
We hope for higher statistics, and so the
following results should be regarded as preliminary.

*Poster presented by D. Pleiter at Lat97, Edinburgh, U.K.

2. SIMULATION DETAILS

We perform the quenched QCD simulations at
$\beta = 6/g^2 = 6.2$. To generate a new gauge configuration
we use 100 cycles, each consisting of a single 3-hit
Metropolis sweep followed by 16 over-relaxation
sweeps using the SU(3) algorithm suggested by
Creutz.

We use Jacobi smearing for source and sink.
We chose the number of smearing steps to be
$N_s = 100$ and for the smearing hopping parameter
we took $\kappa_s = 0.21$, for which the radius of the
smeared source $r_s$ is about $3.5a$, which roughly
corresponds to 0.4 fm. Although we have calculated
the propagator for both smeared and unsmeared
sink, we will only use the results for
smeared sink here.

The simulations are performed for three different
hopping parameters, $\kappa = 0.13520$, $0.13540$,
$0.13555$, with clover coefficient $c_{SW} = 1.614$ chosen
according to the non-perturbative determination by
the Alpha collaboration. For the matrix inversion
we mainly use BiCGstab. The minimal
residue algorithm is used in case BiCGstab does
not converge. As convergence criterion we chose
$r < 10^{-[\ldots]}$, where $r = |Mx - \phi|/|x|$.
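The stopping test above can be sketched in a few lines. This is only an illustration, assuming vectors stored as plain Python lists and Euclidean norms; the names `relative_residual`, `is_converged`, `apply_m` and `tol` are hypothetical, and the tolerance is left as a parameter.

```python
import math


def relative_residual(apply_m, x, phi):
    """Return r = |M x - phi| / |x| using Euclidean norms.

    apply_m : callable returning M applied to a vector (assumed interface)
    x, phi  : sequences of real or complex numbers of equal length
    """
    mx = apply_m(x)
    num = math.sqrt(sum(abs(a - b) ** 2 for a, b in zip(mx, phi)))
    den = math.sqrt(sum(abs(a) ** 2 for a in x))
    return num / den


def is_converged(apply_m, x, phi, tol):
    """True if the relative residual is below the chosen tolerance."""
    return relative_residual(apply_m, x, phi) < tol
```

A solver wrapper could call `is_converged` after the BiCGstab attempt and switch to a fallback solver such as minimal residue when it returns `False`.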

We found up to 4 configurations per $\kappa$ which
show an exceptional pattern (see Fig. 1). They
have been excluded from the evaluation.
Figure 1. Pion propagator at $\kappa = 0.13555$ with separately plotted exceptional configurations.

Figure 2. Effective mass of the pion.

3. RESULTS

Until now we have looked at the $\pi$, $\rho$ and nucleon
masses. We find good plateaus when plotting
the effective mass $m(t) = \ln[c(t)/c(t+1)]$,
as shown in Fig. 2 for the $\pi$.

Plots of the dimensionless ratio $m_N/m_\rho$ as a
function of $(m_\pi/m_\rho)^2$ (so-called APE plot) were
found to be rather different for improved and Wilson
fermions at $\beta = 6.0$. As can be seen from
Fig. 3, the results at $\beta = 6.2$ seem to confirm that
the improved results come closer to the physical
value than the Wilson results.

To see how masses scale as $\beta$ is changed, it has
been suggested that $m_\rho$ should be plotted in
units of the square root of the string tension $K$,
which has cut-off errors of $O(a^2)$ only. In Fig. 4
the ratio $m_\rho/\sqrt{K}$ is shown as a function of $a\sqrt{K}$
for fixed physical $\pi$ masses with $m_\pi^2 = 0$, $2K$ and
$4K$. To obtain the $\rho$ mass for these values of
$m_\pi^2$ we extrapolated, or interpolated, $m_\rho$ using the
phenomenological ansatz

$$m_X^2 = b_0 + b_2 m_\pi^2 + b_3 m_\pi^3, \qquad X = \rho, N. \qquad (3)$$
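The two analysis steps used here, the effective mass $m(t) = \ln[c(t)/c(t+1)]$ and the fit of the phenomenological ansatz of eq. (3), can be sketched as follows. This is a minimal illustration, not the analysis code behind the results: it assumes an unweighted least-squares fit without error estimates, and all function names are hypothetical.

```python
import math


def effective_mass(corr):
    """m(t) = ln[c(t)/c(t+1)] for a list of positive correlator values."""
    return [math.log(corr[t] / corr[t + 1]) for t in range(len(corr) - 1)]


def solve_linear(a, b):
    """Solve a small dense system by Gaussian elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            f = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= f * aug[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(aug[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def fit_ansatz(m_pi, m_x):
    """Unweighted fit of m_X^2 = b0 + b2*m_pi^2 + b3*m_pi^3.

    Returns [b0, b2, b3]. Needs at least three distinct pion masses.
    """
    basis = [[1.0, m ** 2, m ** 3] for m in m_pi]
    y = [m ** 2 for m in m_x]
    # Normal equations (A^T A) b = A^T y for the three coefficients.
    ata = [[sum(row[i] * row[j] for row in basis) for j in range(3)]
           for i in range(3)]
    aty = [sum(row[i] * yi for row, yi in zip(basis, y)) for i in range(3)]
    return solve_linear(ata, aty)


def ansatz_mass(b, m_pi):
    """Evaluate m_X from fitted coefficients b = [b0, b2, b3] at a pion mass."""
    return math.sqrt(b[0] + b[1] * m_pi ** 2 + b[2] * m_pi ** 3)
```

With masses in lattice units, `ansatz_mass(b, 0.0)` gives the chirally extrapolated value, and evaluating at $m_\pi^2 = 2K$ or $4K$ gives the interpolated points. With only three hopping parameters the three-parameter fit is exactly determined, so it carries no information about fit quality.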



Figure 3. APE plot at $\beta = 6.2$ for improved (□) and
Wilson fermions (○ [10,11]). The new improved results
on larger lattices are marked by ×. These data can be
compared with the mass ratio (*) at the physical quark
mass and in the heavy quark limit. The solid line comes
from a fit using the phenomenological ansatz, eq. (3), with
the new preliminary data included.

Figure 4. The ratio $m_\rho/\sqrt{K}$ as a function of the lattice
spacing for Wilson fermions (○) and improved fermions
(□ and ×, this work). This is compared with the
experimental value (*) using $\sqrt{K} = 427$ MeV.

4. T3E PERFORMANCE

On the $32^3 \times 64$ lattice our current program
needs 13.1 s per BiCGstab iteration step on a
T3E with 128 DEC Alpha 21164 5/375 MHz microprocessors.
Simulations done on a 256 node
Quadrics QH2 need 6.3 s for the same operations
on a $24^3 \times 48$ lattice. Comparing the peak performance
of both machines (T3E: 96 Gflops / QH2:
12.8 Gflops) one would expect the T3E to do this
job about twice as fast as the QH2 (although
the calculations on the T3E are done in double
precision). Instead it currently takes about twice
as long. Since the communication overhead in
the BiCGstab routines on the T3E is less than
3%, this indicates a single processor performance
problem. While lattice gauge theory applications
on the QH2 typically reach sustained speeds between
30 and 70% of the peak performance, our
T3E code currently runs at about 10%. This
might be explained by two disadvantages of the
T3E for this kind of problem: the number of
registers and the cache size of the DEC 21164
microprocessors. The stream buffers, which improve
main memory access, make the code about
30% faster.
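The peak-performance comparison can be checked with a short calculation. The sketch below assumes that the time per iteration scales linearly with the number of lattice sites and inversely with peak performance, and it ignores the difference between single and double precision; the function names are hypothetical.

```python
def lattice_sites(ls, lt):
    """Number of sites of an ls^3 x lt lattice."""
    return ls ** 3 * lt


def expected_time(t_ref, sites_ref, peak_ref, sites_new, peak_new):
    """Scale a reference time by lattice volume and inverse peak performance."""
    return t_ref * (sites_new / sites_ref) * (peak_ref / peak_new)


# Numbers quoted in the text: QH2 needs 6.3 s on 24^3 x 48 at 12.8 Gflops peak,
# the T3E has 96 Gflops peak and runs the 32^3 x 64 lattice in 13.1 s.
t3e_expected = expected_time(6.3, lattice_sites(24, 48), 12.8,
                             lattice_sites(32, 64), 96.0)
speedup_expected = 6.3 / t3e_expected   # expected T3E advantage over the QH2
shortfall = 13.1 / t3e_expected         # measured time relative to the estimate
```

Under these assumptions the estimate is about 2.7 s per iteration, roughly 2.4 times faster than the QH2, whereas the measured 13.1 s is about five times longer than that estimate.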